Smart Paper Notes

Evaluation of Machine Learning Techniques for Forecast Uncertainty Quantification

Maximiliano A. Sacco, Juan J. Ruiz, Manuel Pulido, Pierre Tandeo

Categories: Machine Learning | Artificial Intelligence

2021-11-29

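Producing an accurate weather forecast together with a reliable quantification of its uncertainty is an open scientific challenge. So far, ensemble forecasting has been the most successful approach for producing relevant forecasts along with an estimate of their uncertainty. The main limitations of ensemble forecasting are its high computational cost and the difficulty of capturing and quantifying different sources of uncertainty, particularly those associated with model errors. In this work, proof-of-concept model experiments are conducted to examine the performance of ANNs trained to predict a corrected state of the system and the state uncertainty using only a single deterministic forecast as input. We compare different training strategies: one based on direct training using the mean and spread of an ensemble forecast as the target, and others that rely on indirect training using a deterministic forecast as the target, in which the uncertainty is learned implicitly from the data. For the latter approach, two alternative loss functions are proposed and evaluated, one based on the data observation likelihood and the other based on a local estimation of the error. The performance of the networks is examined at different lead times and in scenarios with and without model errors. Experiments using the Lorenz'96 model show that the ANNs are able to emulate some of the properties of ensemble forecasts, such as the filtering of the most unpredictable modes and a state-dependent quantification of the forecast uncertainty. Moreover, the ANNs provide a reliable estimate of the forecast uncertainty in the presence of model error.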
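The likelihood-based loss mentioned in the abstract can be illustrated with a short sketch. The abstract does not state the form of the likelihood, so the Gaussian form below is an assumption, and the function name `gaussian_nll` and its arguments are hypothetical rather than taken from the paper. The idea is that a network outputs both a mean and a variance, and the loss penalizes errors that are large relative to the predicted variance.

```python
from math import log, pi


def gaussian_nll(targets, means, variances):
    """Average negative log-likelihood of targets under N(mean_i, variance_i).

    Assumption: independent Gaussian errors. The paper's actual loss may differ.
    """
    if not (len(targets) == len(means) == len(variances)) or not targets:
        raise ValueError("inputs must be non-empty and of equal length")
    total = 0.0
    for y, m, v in zip(targets, means, variances):
        if v <= 0:
            raise ValueError("variances must be positive")
        # 0.5 * [log(2*pi*v) + (y - m)^2 / v] for one sample
        total += 0.5 * (log(2 * pi * v) + (y - m) ** 2 / v)
    return total / len(targets)


# Same error of 1.0, predicted with honest and with overconfident variance
honest = gaussian_nll([1.0], [0.0], [1.0])
overconfident = gaussian_nll([1.0], [0.0], [0.01])
```

With this form, the overconfident prediction receives a much larger loss than the honest one, which is what pushes the predicted variance toward the actual error level during training.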


Assessment of creditworthiness models privacy-preserving training with synthetic data

Ricardo Muñoz-Cancino, Cristián Bravo, Sebastián A. Ríos, Manuel Graña

Category: Machine Learning

2022-12-31

Credit scoring models are the primary instrument used by financial institutions to manage credit risk. The scarcity of research on behavioral scoring is due to difficult data access. Because financial institutions have to maintain the privacy and security of borrowers' information, they refrain from collaborating in research initiatives. In this work, we present a methodology that allows us to evaluate the performance of models trained with synthetic data when they are applied to real-world data. Our results show that synthetic data quality becomes increasingly poor as the number of attributes increases. However, creditworthiness assessment models trained with synthetic data show a reduction of 3% in AUC and 6% in KS when compared with models trained with real data. These results have a significant impact since they encourage credit risk research based on synthetic data, making it possible to maintain borrowers' privacy and to address problems that until now have been hampered by the limited availability of information.


Genetic-tunneling driven energy optimizer for magnetic system

Qichen Xu, Zhuanglin Shen, Manuel Pereiro, Pawel Herman, Olle Eriksson, Anna Delin

Category: Neural and Evolutionary Computing

2022-12-31

Novel topological spin textures, such as magnetic skyrmions, benefit from their inherent stability, acting as the ground state in several magnetic systems. In the current study of atomic monolayer magnetic materials, reasonable initial guesses are still needed to search for those magnetic patterns. This situation underlines the need to develop a more effective way to identify the ground states. To solve this problem, we propose a genetic-tunneling-driven variance-controlled optimization approach, which combines a local energy minimizer back-end and a metaheuristic global searching front-end. This algorithm is an effective optimization solution for searching for magnetic ground states at extremely low temperatures and is also robust for finding low-energy degenerate states at finite temperatures. We demonstrate the success of this method in searching for magnetic ground states of 2D monolayer systems with both artificial interactions and interactions calculated from density functional theory. It is also worth noting that the inherent concurrency of this algorithm can significantly decrease the execution time. In conclusion, our proposed method provides a useful tool for energy optimization of low-dimensional magnetic systems.


Forecasting through deep learning and modal decomposition in multi-phase concentric jets

León Mata, Rodrigo Abadía-Heredia, Manuel Lopez-Martin, José M. Pérez, Soledad Le Clainche

Category: Machine Learning

2022-12-24

This work presents a set of neural network (NN) models specifically designed for accurate and efficient fluid dynamics forecasting. We show how neural network training can be improved by reducing data complexity through a modal decomposition technique called higher order dynamic mode decomposition (HODMD), which identifies the main structures inside the flow dynamics and reconstructs the original flow using only these main structures. This reconstruction has the same number of samples and the same spatial dimension as the original flow, but with less complex dynamics, while preserving its main features. We also show the low computational cost required by the proposed NN models, both in their training and inference phases. The core idea of this work is to test the limits of applicability of deep learning models to data forecasting in complex fluid dynamics problems. The generalization capabilities of the models are demonstrated by using the same neural network architectures to forecast the future dynamics of four different multi-phase flows. The data sets used to train and test these deep learning models come from Direct Numerical Simulations (DNS) of these flows.


Automatic Emotion Modelling in Written Stories

Lukas Christ, Shahin Amiriparian, Manuel Milling, Ilhan Aslan, Björn W. Schuller

Category: Natural Language Processing

2022-12-21

Telling stories is an integral part of human communication which can evoke emotions and influence the affective states of the audience. Automatically modelling emotional trajectories in stories has thus attracted considerable scholarly interest. However, as most existing works have been limited to unsupervised dictionary-based approaches, there is no labelled benchmark for this task. We address this gap by introducing continuous valence and arousal annotations for an existing dataset of children's stories annotated with discrete emotion categories. We collect additional annotations for this data and map the originally categorical labels to the valence and arousal space. Leveraging recent advances in Natural Language Processing, we propose a set of novel Transformer-based methods for predicting valence and arousal signals over the course of written stories. We explore several strategies for fine-tuning a pretrained ELECTRA model and study the benefits of considering a sentence's context when inferring its emotionality. Moreover, we experiment with additional LSTM and Transformer layers. The best configuration achieves a Concordance Correlation Coefficient (CCC) of .7338 for valence and .6302 for arousal on the test set, demonstrating the suitability of our proposed approach. Our code and additional annotations are made available at https://github.com/lc0197/emotion_modelling_stories.
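For readers unfamiliar with the metric reported above, the following is a minimal sketch of the Concordance Correlation Coefficient in its standard form, CCC = 2·cov(x, y) / (var(x) + var(y) + (mean(x) − mean(y))²). The function name `ccc` is hypothetical, and the use of population (1/n) variance is an assumption; the abstract does not say which estimator the authors used.

```python
from statistics import fmean


def ccc(predictions, gold):
    """Concordance Correlation Coefficient between two equal-length sequences.

    Assumption: population (1/n) variance and covariance.
    Undefined (division by zero) when both sequences are constant and equal.
    """
    if len(predictions) != len(gold) or len(predictions) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(predictions)
    mean_p, mean_g = fmean(predictions), fmean(gold)
    var_p = sum((p - mean_p) ** 2 for p in predictions) / n
    var_g = sum((g - mean_g) ** 2 for g in gold) / n
    cov = sum((p - mean_p) * (g - mean_g) for p, g in zip(predictions, gold)) / n
    # The squared mean difference penalizes a constant offset between the signals
    return 2 * cov / (var_p + var_g + (mean_p - mean_g) ** 2)


perfect = ccc([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])
shifted = ccc([1.0, 2.0, 3.0], [2.0, 3.0, 4.0])
```

Unlike Pearson correlation, which would be 1 for both pairs, the CCC drops for the shifted pair because the predictions are offset from the gold signal.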


Lessons from Robot-Assisted Disaster Response Deployments by the German Rescue Robotics Center Task Force

Hartmut Surmann, Ivana Kruijff-Korbayova, Kevin Daun, Marius Schnaubelt, Oskar von Stryk, Manuel Patchou, Stefan Boecker, Christian Wietfeld, Jan Quenzel, Daniel Schleich

Category: Robotics

2022-12-19

Earthquakes, fires, and floods often cause structural collapses of buildings. Inspecting damaged buildings poses a high risk for emergency forces or is even impossible. We present three selected recent missions of the Robotics Task Force of the German Rescue Robotics Center, in which both ground and aerial robots were used to explore destroyed buildings. We describe and reflect on the missions as well as the lessons learned from them. In order to make robots from research laboratories fit for real operations, realistic test environments were set up for outdoor and indoor use and tested in regular exercises by researchers and emergency forces. Based on this experience, the robots and their control software were significantly improved. Furthermore, top teams of researchers and first responders were formed, each with realistic assessments of the operational and practical suitability of robotic systems.


Smart Face Shield: A Sensor-Based Wearable Face Shield Utilizing Computer Vision Algorithms

Manuel Luis C. Delos Santos, Ronaldo S. Tinio, Darwin B. Diaz, Karlene Emily I. Tolosa

Category: Computer Vision

2022-12-18

The study aims to develop a wearable device to combat the onslaught of COVID-19 and to enhance the regular face shields available on the market. It also aims to raise awareness of the health and safety protocols initiated by the government and its affiliates for enforcing social distancing, through the integration of computer vision algorithms. The wearable device is composed of various hardware and software components, such as a transparent polycarbonate face shield, a microprocessor, sensors, a camera, a thin-film transistor on-screen display, jumper wires, a power bank, and the Python programming language. The algorithm incorporated in the study is object detection, a computer vision machine learning technique. The front camera, with OpenCV, determines the distance of a person in front of the user. TensorFlow is used to identify and detect the target object in the image or live feed and to obtain its bounding boxes. Determining the distance from the camera to the target object requires the focal length of the lens. To get the focal length, multiply the pixel width by the known distance and divide the result by the known width (Rosebrock, 2020). Unit testing ensures that the parameters are valid in terms of design and specifications.
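The focal length calculation described above can be written out as a short sketch. The first function follows the abstract directly (pixel width times known distance, divided by known width). The second function, which inverts the same similar-triangles relation to estimate distance, is an assumption about how the focal length is then used; the function names and the calibration numbers are hypothetical.

```python
def focal_length(pixel_width: float, known_distance: float, known_width: float) -> float:
    """Calibration step from the abstract: F = (P * D) / W."""
    return pixel_width * known_distance / known_width


def distance_to_object(focal: float, known_width: float, pixel_width: float) -> float:
    """Assumed inverse of the same relation: D' = (W * F) / P'."""
    return known_width * focal / pixel_width


# Hypothetical calibration: an object 0.5 m wide, 2.0 m from the camera, spans 250 px
f = focal_length(pixel_width=250.0, known_distance=2.0, known_width=0.5)

# Later the same object spans 125 px (for example, the width of a detected bounding box)
d = distance_to_object(f, known_width=0.5, pixel_width=125.0)
```

In this example the focal length comes out as 1000 (in pixel units), and halving the apparent width doubles the estimated distance to 4.0 m.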


Two-sample test based on Self-Organizing Maps

Alejandro Álvarez-Ayllón, Manuel Palomo-Duarte, Juan-Manuel Dodero

Categories: Machine Learning | Neural and Evolutionary Computing

2022-12-17

Machine-learning classifiers can be leveraged as a two-sample statistical test. Suppose each sample is assigned a different label and a classifier can obtain a better-than-chance result when discriminating between them. In this case, we can infer that the two samples originate from different populations. However, many types of models, such as neural networks, behave as a black box for the user: they can reject the hypothesis that both samples originate from the same population, but they do not offer insight into how the samples differ. Self-Organizing Maps are a dimensionality reduction technique initially devised as a data visualization tool that displays emergent properties, and they are also useful for classification tasks. Since they can be used as classifiers, they can also be used as a two-sample statistical test. And since their original purpose is visualization, they can also offer insights.
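The general idea of a classifier two-sample test can be sketched as follows. This is not the authors' method: a one-dimensional nearest-centroid classifier stands in for the Self-Organizing Map, the even/odd train/test split and the exact one-sided binomial test are assumptions, and all names and data are hypothetical.

```python
from math import comb
from statistics import fmean


def binomial_p_value(successes: int, trials: int) -> float:
    """One-sided P(X >= successes) for X ~ Binomial(trials, 0.5)."""
    return sum(comb(trials, k) for k in range(successes, trials + 1)) / 2 ** trials


def classifier_two_sample_test(sample_a, sample_b):
    """Return (held-out accuracy, p-value) for H0: both samples share a population."""
    # Label sample_a as 0 and sample_b as 1; even indices train, odd indices test
    train_a, test_a = sample_a[0::2], sample_a[1::2]
    train_b, test_b = sample_b[0::2], sample_b[1::2]
    centroid_a, centroid_b = fmean(train_a), fmean(train_b)

    def predict(x):
        # Nearest-centroid rule; ties go to label 0
        return 0 if abs(x - centroid_a) <= abs(x - centroid_b) else 1

    correct = sum(predict(x) == 0 for x in test_a) + sum(predict(x) == 1 for x in test_b)
    trials = len(test_a) + len(test_b)
    return correct / trials, binomial_p_value(correct, trials)


sample_a = [0.1, 0.4, 0.2, 0.3, 0.0, 0.5, 0.2, 0.1, 0.3, 0.4, 0.2, 0.3]
sample_b = [x + 2.0 for x in sample_a]  # clearly shifted population

acc_diff, p_diff = classifier_two_sample_test(sample_a, sample_b)
acc_same, p_same = classifier_two_sample_test(sample_a, sample_a)
```

For the shifted pair the held-out accuracy is well above chance and the p-value is small, so the null hypothesis is rejected; for identical samples the accuracy stays at chance level. As the abstract notes, a classifier like this one says that the samples differ but not how, which is the gap the Self-Organizing Map is meant to fill.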


An automated parameter domain decomposition approach for gravitational wave surrogates using hp-greedy refinement

Franco Cerino, J. Andrés Diaz-Pace, Manuel Tiglio

Category: Machine Learning

2022-12-16

We introduce hp-greedy, a refinement approach for building gravitational wave surrogates as an extension of the standard reduced basis framework. Our proposal is data-driven, with a domain decomposition of the parameter space, local reduced basis, and a binary tree as the resulting structure, which are obtained in an automated way. When compared to the standard global reduced basis approach, the numerical simulations of our proposal show three salient features: i) representations of lower dimension with no loss of accuracy, ii) a significantly higher accuracy for a fixed maximum dimensionality of the basis, in some cases by orders of magnitude, and iii) results that depend on the reduced basis seed choice used by the refinement algorithm. We first illustrate the key parts of our approach with a toy model and then present a more realistic use case of gravitational waves emitted by the collision of two spinning, non-precessing black holes. We discuss performance aspects of hp-greedy, such as overfitting with respect to the depth of the tree structure, and other hyperparameter dependences. As two direct applications of the proposed hp-greedy refinement, we envision: i) a further acceleration of statistical inference, which might be complementary to focused reduced-order quadratures, and ii) the search of gravitational waves through clustering and nearest neighbors.


Multimodal Teacher Forcing for Reconstructing Nonlinear Dynamical Systems

Manuel Brenner, Georgia Koppe, Daniel Durstewitz

Category: Machine Learning

2022-12-15

Many, if not most, systems of interest in science are naturally described as nonlinear dynamical systems (DS). Empirically, we commonly access these systems through time series measurements, where often we have time series from different types of data modalities simultaneously. For instance, we may have event counts in addition to some continuous signal. While by now there are many powerful machine learning (ML) tools for integrating different data modalities into predictive models, this has rarely been approached so far from the perspective of uncovering the underlying, data-generating DS (aka DS reconstruction). Recently, sparse teacher forcing (TF) has been suggested as an efficient control-theoretic method for dealing with exploding loss gradients when training ML models on chaotic DS. Here we incorporate this idea into a novel recurrent neural network (RNN) training framework for DS reconstruction based on multimodal variational autoencoders (MVAE). The forcing signal for the RNN is generated by the MVAE which integrates different types of simultaneously given time series data into a joint latent code optimal for DS reconstruction. We show that this training method achieves significantly better reconstructions on multimodal datasets generated from chaotic DS benchmarks than various alternative methods.
